Density-based Imputation Method for Fuzzy Cluster Analysis of Gene Expression Microarray Data

Authors

  • Thanh Le
  • Tom Altman
  • Katheleen J. Gardiner
Abstract

Fuzzy clustering has been widely used for analysis of gene expression microarray data. However, most fuzzy clustering algorithms require complete datasets and, because of technical limitations, most microarray datasets have missing values. To address this problem, we present a new algorithm where genes are clustered using the Fuzzy C-Means algorithm (FCM). The fuzzy partition obtained is then used to create a density-based fuzzy partition which is used with the FCM fuzzy partition to estimate the missing values in the dataset. We show that our method outperforms five popular imputation algorithms on both artificial and real datasets.

Availability: The test datasets and the software are available online at http://ouray.ucdenver.edu/~tnle/fzdbi

Keywords: microarray data; missing value estimation; fuzzy c-means; cluster density
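
As a rough illustration of the cluster-then-impute idea in the abstract, the sketch below runs plain FCM and fills each missing entry with a membership-weighted average of the cluster centers. It is not the authors' method: the density-based fuzzy partition is not reproduced here because the abstract does not specify it. The function names `fcm` and `fcm_impute`, the column-mean starting values, and all parameter defaults are assumptions made for this example only.

```python
import math
import random


def fcm(rows, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain Fuzzy C-Means on complete data; returns (memberships, centers)."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    # Random initial memberships, normalised so each row sums to 1.
    U = []
    for _ in range(n):
        r = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(r)
        U.append([v / s for v in r])
    V = []
    for _ in range(n_iter):
        W = [[u ** m for u in row] for row in U]
        totals = [sum(W[k][i] for k in range(n)) for i in range(c)]
        # Centers are membership-weighted means of the rows.
        V = [[sum(W[k][i] * rows[k][j] for k in range(n)) / totals[i]
              for j in range(d)] for i in range(c)]
        U_new = []
        for x in rows:
            # Squared Euclidean distance to each center, floored to avoid 0.
            dist = [max(sum((a - b) ** 2 for a, b in zip(x, v)), 1e-12)
                    for v in V]
            inv = [dk ** (-1.0 / (m - 1.0)) for dk in dist]
            s = sum(inv)
            U_new.append([v / s for v in inv])
        shift = max(abs(a - b)
                    for ra, rb in zip(U_new, U) for a, b in zip(ra, rb))
        U = U_new
        if shift < tol:
            break
    return U, V


def fcm_impute(rows, c, m=2.0, n_rounds=10):
    """Fill NaN entries with membership-weighted cluster-center values.

    Assumption of this sketch: every column has at least one observed value.
    """
    n, d = len(rows), len(rows[0])
    missing = [[math.isnan(v) for v in row] for row in rows]
    # Start from the column means of the observed entries.
    col_mean = []
    for j in range(d):
        obs = [rows[k][j] for k in range(n) if not missing[k][j]]
        col_mean.append(sum(obs) / len(obs))
    filled = [[col_mean[j] if missing[k][j] else rows[k][j]
               for j in range(d)] for k in range(n)]
    # Alternate between clustering the filled matrix and re-estimating
    # only the entries that were missing in the input.
    for _ in range(n_rounds):
        U, V = fcm(filled, c, m=m)
        for k in range(n):
            w = [u ** m for u in U[k]]
            for j in range(d):
                if missing[k][j]:
                    filled[k][j] = sum(w[i] * V[i][j] for i in range(c)) / sum(w)
    return filled
```

Calling `fcm_impute(matrix, c=2)` on a list of gene rows with `float("nan")` marking missing measurements returns a new matrix; observed entries are copied through unchanged and only the missing ones are estimated.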

Related resources

Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis

Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...

Gene Identification from Microarray Data for Diagnosis of Acute Myeloid and Lymphoblastic Leukemia Using a Sparse Gene Selection Method

Background: Microarray experiments can simultaneously determine the expression of thousands of genes. Identification of potential genes from microarray data for diagnosis of cancer is important. This study aimed to identify genes for the diagnosis of acute myeloid and lymphoblastic leukemia using a sparse feature selection method. Materials and Methods: In this descriptive study, the expressio...

Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine

DNA microarray gene expression data provide a wealth of information across thousands of variables (genes). Analysis of this information can reveal the genetic causes of disease and the differences between tumors. In this study we try to reduce the high-dimensional data with a statistical method, selecting valuable high-impact genes as biomarkers, and then classify ovarian tumors based on gene expression data of...

Missing Value Estimation for DNA Microarray Expression Data: Least Squares Imputation

Motivation: Gene expression microarray data sets often contain missing expression values. Robust missing value estimation methods are needed since many algorithms for gene expression analysis require a complete matrix of gene array values. In this paper, imputation methods based on the least squares and cluster structure are proposed to estimate missing values in the gene expression data, which...

CF-GeNe: Fuzzy Framework for Robust Gene Regulatory Network Inference

Most Gene Regulatory Network (GRN) studies ignore the impact of the noisy nature of gene expression data despite its significant influence upon inferred results. This paper presents an innovative Collateral-Fuzzy Gene Regulatory Network Reconstruction (CF-GeNe) framework for Gene Regulatory Network (GRN) inference. The approach uses the Collateral Missing Value Estimation (CMVE) algorithm as it...

Publication date: 2012